The Voice of Optimization
We introduce the idea that, using optimal classification trees (OCTs) and
optimal classification trees with hyperplanes (OCT-Hs), interpretable machine
learning algorithms developed by Bertsimas and Dunn [2017, 2018], we can gain
insight into the strategy behind the optimal solution of continuous and
mixed-integer convex optimization problems as a function of the key parameters
that affect the problem. In this way, optimization is no longer a black box.
Instead, we redefine optimization as a multiclass classification problem in
which the predictor gives insight into the logic behind the optimal solution.
In other words, OCTs and OCT-Hs give optimization a voice. We show on several
realistic examples that the accuracy of our method is in the 90%-100% range,
and that even when the predictions are not correct, the degree of suboptimality
or infeasibility is very low. We compare the optimal strategy predictions of
OCTs and OCT-Hs with those of feedforward neural networks (NNs) and conclude
that OCT-Hs and NNs perform comparably, while OCTs are somewhat weaker but
often competitive. Therefore, our approach provides a novel, insightful
understanding of optimal strategies for solving a broad class of continuous
and mixed-integer optimization problems.
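
As a rough illustration of the strategy-classification idea, the following sketch trains an interpretable tree on problem parameters; scikit-learn's DecisionTreeClassifier stands in for the OCTs of Bertsimas and Dunn, and the parametric data and strategy labels are purely illustrative assumptions:

```python
# Sketch: encode "which constraints are tight / which integers are fixed" at
# the optimum as a multiclass label, then train an interpretable tree mapping
# key problem parameters to that label. DecisionTreeClassifier is a stand-in
# for the OCTs/OCT-Hs of the paper; the data and labels are toy assumptions.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
theta = rng.uniform(-1, 1, size=(1000, 4))   # key problem parameters
# Hypothetical strategy labels; in practice each label encodes the tight
# constraints and integer variable values of the optimal solution.
strategy = (theta[:, 0] > 0).astype(int) + 2 * (theta[:, 1] > 0.5).astype(int)

clf = DecisionTreeClassifier(max_depth=3).fit(theta, strategy)
print("training accuracy:", clf.score(theta, strategy))
# Online: predict the strategy for new parameters, then recover the full
# solution by solving the much simpler problem restricted to that strategy.
print("predicted strategy:", clf.predict(theta[:1]))
```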
Online Mixed-Integer Optimization in Milliseconds
We propose a method to solve online mixed-integer optimization (MIO) problems
at very high speed using machine learning. By exploiting the repetitive nature
of online optimization, we are able to greatly speed up the solution time. Our
approach encodes the optimal solution into a small amount of information,
denoted the strategy, using the Voice of Optimization framework proposed in
[BS21]. In this way, the core part of the optimization algorithm becomes a
multiclass classification problem which can be solved very quickly. In this
work, we extend that framework to real-time and high-speed applications
focusing on parametric mixed-integer quadratic optimization (MIQO). We propose
an extremely fast online optimization algorithm consisting of a feedforward
neural network (NN) evaluation and a linear system solution where the matrix
has already been factorized. Therefore, this online approach does not require
any solver nor iterative algorithm. We show the speed of the proposed method
both in terms of total computations required and measured execution time. We
estimate the number of floating point operations (flops) required to completely
recover the optimal solution as a function of the problem dimensions. Compared
to state-of-the-art MIO routines, the online running time of our method is very
predictable and can be lower than a single matrix factorization time. We
benchmark our method against the state-of-the-art solver Gurobi, obtaining
speedups of two to three orders of magnitude on examples from fuel cell energy
management, sparse portfolio optimization, and motion planning with obstacle
avoidance.
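
The online phase can be sketched as follows: once the strategy (tight constraints and integer assignments) is predicted, the MIQO collapses to an equality-constrained QP whose KKT system is linear, so the matrix can be factorized offline and each online solve is a single back-substitution. The dimensions and data below are illustrative assumptions:

```python
# Sketch of the online phase: with the strategy fixed, only an equality-
# constrained QP remains, and its KKT system is linear. The KKT matrix depends
# only on the strategy, so it is LU-factorized offline; each online solve is a
# back-substitution, with no solver and no iterative algorithm. Toy data below.
import numpy as np
from scipy.linalg import lu_factor, lu_solve

n, m = 5, 2                                   # illustrative dimensions
P = np.eye(n)                                 # QP Hessian (positive definite here)
A = np.random.default_rng(1).standard_normal((m, n))  # tight constraints from the strategy

KKT = np.block([[P, A.T], [A, np.zeros((m, m))]])
lu, piv = lu_factor(KKT)                      # offline factorization

def online_solve(q, b):
    """Online step: one back-substitution yields primal x and multipliers nu."""
    sol = lu_solve((lu, piv), np.concatenate([-q, b]))
    return sol[:n], sol[n:]

x, nu = online_solve(q=np.ones(n), b=np.zeros(m))
print(x)
```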
Equitable Data-Driven Resource Allocation to Fight the Opioid Epidemic: A Mixed-Integer Optimization Approach
The opioid epidemic is a crisis that has plagued the United States (US) for
decades. One central issue of the epidemic is inequitable access to treatment
for opioid use disorder (OUD), which puts certain populations at a higher risk
of opioid overdose. We integrate a predictive dynamical model and a
prescriptive optimization problem to compute high-quality opioid treatment
facility and treatment budget allocations for each US state. Our predictive
model is a differential equation-based epidemiological model that captures the
dynamics of the opioid epidemic. We use neural ordinary differential equations
to fit this model to opioid epidemic data for each state and obtain estimates
for unknown parameters in the model. We then incorporate this epidemiological
model into a corresponding mixed-integer optimization problem (MIP) that aims
to minimize the number of opioid overdose deaths and the number of people with
OUD. We develop strong relaxations based on McCormick envelopes to efficiently
compute approximate solutions to our MIPs with optimality gaps below 1%. Our
method provides socioeconomically equitable solutions, as it
incentivizes investments in areas with higher social vulnerability (from the US
Centers for Disease Control's Social Vulnerability Index) and opioid
prescribing rates. On average, compared to the baseline epidemiological
model's predictions, our approach decreases the number of people with OUD by
6.08 ± 0.863%, increases the number of people in treatment by 22.57 ± 3.633%,
and decreases the number of opioid-related deaths by 0.55 ± 0.105% after 2
years. We find that treatment facilities should be moved or added to counties
that have significantly fewer facilities than their population share and
higher social vulnerability. Future iterations of our approach could be
implemented as a decision-making tool to tackle opioid treatment
inaccessibility.
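
The McCormick relaxation the abstract refers to can be illustrated on a single bilinear term; the sketch below, using cvxpy, encodes the four standard McCormick inequalities for w = x·y under box bounds (the bounds and objective are toy assumptions, not the paper's model):

```python
# Sketch of a McCormick envelope: the bilinear term w = x*y with box bounds is
# relaxed into four linear inequalities. The paper's MIP is far larger; this
# only illustrates the relaxation technique it builds on. Bounds are toy values.
import cvxpy as cp

xl, xu, yl, yu = 0.0, 1.0, 0.0, 2.0           # illustrative box bounds
x, y, w = cp.Variable(), cp.Variable(), cp.Variable()

mccormick = [
    w >= xl * y + yl * x - xl * yl,           # underestimators
    w >= xu * y + yu * x - xu * yu,
    w <= xu * y + yl * x - xu * yl,           # overestimators
    w <= xl * y + yu * x - xl * yu,
    xl <= x, x <= xu, yl <= y, y <= yu,
]
prob = cp.Problem(cp.Maximize(w), mccormick)
prob.solve()
print("relaxed maximum of w:", w.value)       # upper bound on max of x*y
```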
OSQP: An Operator Splitting Solver for Quadratic Programs
We present a general-purpose solver for convex quadratic programs based on
the alternating direction method of multipliers, employing a novel operator
splitting technique that requires the solution of a quasi-definite linear
system with the same coefficient matrix at almost every iteration. Our
algorithm is very robust, placing no requirements on the problem data such as
positive definiteness of the objective function or linear independence of the
constraint functions. It can be configured to be division-free once an initial
matrix factorization is carried out, making it suitable for real-time
applications in embedded systems. In addition, our technique is the first
operator splitting method for quadratic programs able to reliably detect primal
and dual infeasible problems from the algorithm iterates. The method also
supports factorization caching and warm starting, making it particularly
efficient when solving parametrized problems arising in finance, control, and
machine learning. Our open-source C implementation OSQP has a small footprint,
is library-free, and has been extensively tested on many problem instances from
a wide variety of application areas. It is typically ten times faster than
competing interior-point methods, and sometimes much more when factorization
caching or warm start is used. OSQP has already shown a large impact with tens
of thousands of users both in academia and in large corporations.
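
A minimal usage sketch of OSQP's Python interface on a toy QP, including a parametrized re-solve that exercises the factorization caching and warm starting described above:

```python
# Toy QP solved with OSQP's Python interface, followed by a parametrized
# re-solve: only the linear cost changes, so OSQP reuses the cached
# factorization and warm-starts from the previous iterate.
import numpy as np
import scipy.sparse as sp
import osqp

# minimize 0.5 x'Px + q'x  subject to  l <= Ax <= u
P = sp.csc_matrix([[4.0, 1.0], [1.0, 2.0]])
q = np.array([1.0, 1.0])
A = sp.csc_matrix([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
l = np.array([1.0, 0.0, 0.0])
u = np.array([1.0, 0.7, 0.7])

solver = osqp.OSQP()
solver.setup(P, q, A, l, u, verbose=False)
print(solver.solve().x)

solver.update(q=np.array([2.0, 3.0]))          # new problem instance
print(solver.solve().x)                        # warm-started, no refactorization
```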
Learning to Warm-Start Fixed-Point Optimization Algorithms
We introduce a machine-learning framework to warm-start fixed-point
optimization algorithms. Our architecture consists of a neural network mapping
problem parameters to warm starts, followed by a predefined number of
fixed-point iterations. We propose two loss functions designed to either
minimize the fixed-point residual or the distance to a ground truth solution.
In this way, the neural network predicts warm starts with the end-to-end goal
of minimizing the downstream loss. An important feature of our architecture is
its flexibility, in that it can predict a warm start for fixed-point algorithms
run for any number of steps, without being limited to the number of steps it
has been trained on. We provide PAC-Bayes generalization bounds on unseen data
for common classes of fixed-point operators: contractive, linearly convergent,
and averaged. Applying this framework to well-known applications in control,
statistics, and signal processing, we observe that learned warm starts
significantly reduce the number of iterations and the solution time required
to solve these problems.
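
A minimal PyTorch sketch of the architecture, assuming a toy contractive operator (a gradient step toward a known solution) in place of a real fixed-point algorithm; the network, unroll depth, and training loop are illustrative:

```python
# Sketch: a network maps problem parameters to a warm start, a fixed number of
# fixed-point iterations run on top of it, and training minimizes the final
# fixed-point residual end to end. The operator T is a toy contraction (a
# gradient step for min_z 0.5||z - theta||^2); all sizes are illustrative.
import torch

k = 10                                          # unrolled fixed-point iterations
net = torch.nn.Sequential(torch.nn.Linear(3, 32), torch.nn.ReLU(),
                          torch.nn.Linear(32, 3))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def T(z, theta, step=0.5):
    """Toy contractive operator with fixed point z = theta."""
    return z - step * (z - theta)

for _ in range(200):
    theta = torch.randn(64, 3)                  # sampled problem parameters
    z = net(theta)                              # learned warm start
    for _ in range(k):                          # run k fixed-point iterations
        z = T(z, theta)
    loss = ((T(z, theta) - z) ** 2).sum(dim=1).mean()  # fixed-point residual
    opt.zero_grad(); loss.backward(); opt.step()
```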
Mean Robust Optimization
Robust optimization is a tractable and expressive technique for
decision-making under uncertainty, but it can lead to overly conservative
decisions when pessimistic assumptions are made on the uncertain parameters.
Wasserstein distributionally robust optimization can reduce conservatism by
being data-driven, but it often leads to very large problems with prohibitive
solution times. We introduce mean robust optimization, a general framework that
combines the best of both worlds by providing a trade-off between computational
effort and conservatism. We propose uncertainty sets constructed from
clustered data rather than directly from the observed data points, thereby
significantly reducing the problem size. By varying the number of clusters, our
method bridges between robust and Wasserstein distributionally robust
optimization. We show finite-sample performance guarantees and explicitly
control the potential additional pessimism introduced by any clustering
procedure. In addition, we prove conditions under which, when the uncertainty
enters the constraints linearly, clustering does not affect the optimal
solution. We illustrate the efficiency and performance preservation of our
method on several numerical examples, obtaining speedups of multiple orders of
magnitude in solution time with little to no effect on solution quality.
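
The clustering construction can be sketched on a toy portfolio problem: cluster the uncertainty samples with k-means, then hedge against a norm ball around each centroid rather than around every data point. The radius, cluster count, and data below are illustrative assumptions:

```python
# Sketch on a toy portfolio: cluster the sampled returns, then hedge against a
# norm ball of radius eps around each centroid, weighting clusters by size.
# Uses sup over ||u - c_k||_2 <= eps of -u'x = -c_k'x + eps*||x||_2.
import numpy as np
import cvxpy as cp
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
returns = rng.normal(0.05, 0.2, size=(500, 5))  # sampled uncertain returns
K, eps = 10, 0.05                               # clusters and ball radius

km = KMeans(n_clusters=K, n_init=10).fit(returns)
centers = km.cluster_centers_
weights = np.bincount(km.labels_, minlength=K) / len(returns)

x = cp.Variable(5, nonneg=True)
# Average the cluster-wise worst-case losses instead of the sample-wise ones.
worst = [-centers[k] @ x + eps * cp.norm(x, 2) for k in range(K)]
prob = cp.Problem(cp.Minimize(sum(weights[k] * worst[k] for k in range(K))),
                  [cp.sum(x) == 1])
prob.solve()
print(x.value)
```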
Learning for Robust Optimization
We propose a data-driven technique to automatically learn the uncertainty
sets in robust optimization. Our method reshapes the uncertainty sets by
minimizing the expected performance across a family of problems while
guaranteeing constraint satisfaction. We learn the uncertainty sets using a
novel stochastic augmented Lagrangian method that relies on differentiating the
solutions of the robust optimization problems with respect to the parameters of
the uncertainty set. We show sublinear convergence to stationary points under
mild assumptions, and finite-sample probabilistic guarantees of constraint
satisfaction using empirical process theory. Our approach is very flexible and
can learn a wide variety of uncertainty sets while preserving tractability.
Numerical experiments show that our method outperforms traditional approaches
to robust and distributionally robust optimization in terms of out-of-sample
performance and constraint satisfaction guarantees. We implement our method
in the open-source package LROPT.
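
A crude stand-in for the idea, assuming a single radius parameter for the uncertainty set and a finite-difference gradient in place of the paper's exact differentiation through the robust solutions (and of the LROPT implementation); all data and the penalty weight are illustrative:

```python
# Crude stand-in for learning an uncertainty set: parametrize the set by a
# radius rho, solve the robust problem, and descend a penalized out-of-sample
# objective using finite differences. The paper instead differentiates the
# robust solutions exactly inside a stochastic augmented Lagrangian method.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
a, c = np.array([1.0, 2.0]), np.array([1.0, 1.0])
u_train = rng.normal(0, 0.1, size=(200, 2))     # uncertainty samples

def solve_robust(rho):
    x = cp.Variable(2, nonneg=True)
    # (a + u)'x <= 1 for all ||u||_2 <= rho  <=>  a'x + rho*||x||_2 <= 1
    cp.Problem(cp.Maximize(c @ x), [a @ x + rho * cp.norm(x, 2) <= 1]).solve()
    return x.value

def penalized_objective(rho, lam=10.0):
    x = solve_robust(rho)
    violations = np.maximum((a + u_train) @ x - 1, 0)
    return -c @ x + lam * violations.mean()     # performance + violation penalty

rho, lr, h = 0.5, 0.05, 1e-3
for _ in range(20):                             # finite-difference descent
    grad = (penalized_objective(rho + h) - penalized_objective(rho - h)) / (2 * h)
    rho = max(rho - lr * grad, h)               # keep the radius positive
print("learned radius:", rho)
```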
Learning Rationality in Potential Games
We propose a stochastic first-order algorithm to learn the rationality
parameters of simultaneous and non-cooperative potential games, i.e., the
parameters of the agents' optimization problems. Our technique combines (i) an
active-set step that enforces that the agents play at a Nash equilibrium, and
(ii) an implicit-differentiation step to update the estimates of the
rationality parameters. We detail the convergence properties of our algorithm
and perform numerical experiments on Cournot and congestion games, showing
that our algorithm effectively finds high-quality solutions (in terms of
out-of-sample loss) and scales to large datasets.
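
A toy sketch of the estimation idea on a Cournot game with linear inverse demand, where the Nash equilibrium has a closed form that can be backpropagated through; the closed form stands in for the paper's active-set and implicit-differentiation steps, and all data here is synthetic:

```python
# Toy sketch: in a Cournot game with inverse demand p(Q) = alpha - beta*Q and
# marginal costs c, the Nash equilibrium has a closed form, so an out-of-sample
# loss can be backpropagated through it to recover c. The closed form is a
# stand-in for the paper's active-set and implicit-differentiation machinery.
import torch

alpha, beta, N = 10.0, 1.0, 3
c_true = torch.tensor([1.0, 2.0, 3.0])

def nash_quantities(c):
    """Closed-form Cournot-Nash from the first-order conditions:
    Q = (N*alpha - sum(c)) / (beta*(N+1)), q_i = (alpha - c_i - beta*Q)/beta."""
    Q = (N * alpha - c.sum()) / (beta * (N + 1))
    return (alpha - c - beta * Q) / beta

q_obs = nash_quantities(c_true) + 0.01 * torch.randn(100, N)  # noisy play data

c_hat = torch.zeros(N, requires_grad=True)      # rationality parameters to learn
opt = torch.optim.Adam([c_hat], lr=0.05)
for _ in range(500):
    loss = ((nash_quantities(c_hat) - q_obs) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
print(c_hat.detach())                           # approaches c_true
```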